METHODS AND APPARATUSES FOR
SEARCHING BOTH EXTERNAL PUBLIC DOCUMENTS 
AND INTERNAL PRIVATE DOCUMENTS 
IN RESPONSE TO A SINGLE SEARCH REQUEST 

FIELD OF THE INVENTION

The invention relates to data processing. More specifically, the invention relates
to searching and/or retrieving, in response to a single search request, public external
documents and private internal documents that have been unconsciously captured.

BACKGROUND OF THE INVENTION 
Internet portals (or gateways) are specialized World Wide Web sites that provide
a starting site for users accessing the Web. Typically, these portals provide globally
useful content and searching capabilities. Portals primarily provide value by helping
users find and use Web content.

Typical services offered by portal sites include a directory of Web sites, a facility
to search for other sites, news, weather information, e-mail, stock quotes, phone and map
information, etc. However, portals only provide the ability to search documents that have
been made public by the publishers of the documents. Many documents are not available
to these portals.

Many organizations have a document management policy for managing and
maintaining internal private documents. Because these documents are intended to be
private, they are not available to the public for searching via portals. Thus, if a searching
party wishes to search both public and private documents, at least two search requests are
required and at least two search results must be evaluated. Generating multiple search
requests and evaluating multiple search results can be time consuming and inefficient.

Therefore, what is needed is an improved scheme for searching public and private
documents.

SUMMARY OF THE INVENTION

Methods and apparatuses for searching both external public documents and
internal private documents in response to a single search request are described. A first
search request is generated automatically with an electronic device in response to an
original search request. The first search request causes a search to be performed on
electronic documents unconsciously captured by a local network device. The search of
the unconsciously captured electronic documents is performed according to search
parameters of the original search request. A second search request is generated
automatically with the electronic device in response to the original search request. The
second search request causes a search to be performed on electronic documents available
via a network portal of an external network according to the search parameters of the
original search request.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is illustrated by way of example, and not by way of limitation in the 
figures of the accompanying drawings in which like reference numerals refer to similar 
elements. 

Figure 1 illustrates one embodiment of a file management system. 

Figure 2A illustrates one embodiment of unconscious capture using the MIME
format.

Figure 2B illustrates one embodiment of unconscious capture in an FMA 
environment. 

Figure 2C illustrates one embodiment of the document storage process in an FMA
environment.

Figure 3 illustrates one embodiment of a block diagram of a portal appliance. 

Figure 4 is a flow diagram of one embodiment of a process for searching public 
documents and private documents in response to a single request. 

Figure 5 is a flow diagram of one embodiment of a process for capturing public
content at predetermined times for unconscious capture.

DETAILED DESCRIPTION

Methods and apparatuses for searching both external public documents and
internal private documents in response to a single search request are described. In the
following description, for purposes of explanation, numerous specific details are set forth
in order to provide a thorough understanding of the invention. It will be apparent,
however, to one skilled in the art that the invention can be practiced without these
specific details. In other instances, structures and devices are shown in block diagram
form in order to avoid obscuring the invention.

Reference in the specification to "one embodiment" or "an embodiment" means
that a particular feature, structure, or characteristic described in connection with the
embodiment is included in at least one embodiment of the invention. The appearances of
the phrase "in one embodiment" in various places in the specification are not necessarily
all referring to the same embodiment.

Some portions of the detailed descriptions which follow are presented in terms of
algorithms and symbolic representations of operations on data bits within a computer
memory. These algorithmic descriptions and representations are the means used by those
skilled in the data processing arts to most effectively convey the substance of their work
to others skilled in the art. An algorithm is here, and generally, conceived to be a
self-consistent sequence of steps leading to a desired result. The steps are those requiring
physical manipulations of physical quantities. Usually, though not necessarily, these
quantities take the form of electrical or magnetic signals capable of being stored,
transferred, combined, compared, and otherwise manipulated. It has proven convenient
at times, principally for reasons of common usage, to refer to these signals as bits, values,
elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be
associated with the appropriate physical quantities and are merely convenient labels
applied to these quantities. Unless specifically stated otherwise as apparent from the
following discussion, it is appreciated that throughout the description, discussions
utilizing terms such as "processing" or "computing" or "calculating" or "determining" or
"displaying" or the like, refer to the action and processes of a computer system, or similar
electronic computing device, that manipulates and transforms data represented as
physical (electronic) quantities within the computer system's registers and memories into
other data similarly represented as physical quantities within the computer system
memories or registers or other such information storage, transmission or display devices.

The present invention also relates to apparatus for performing the operations
herein. This apparatus may be specially constructed for the required purposes, or it may
comprise a general purpose computer selectively activated or reconfigured by a computer
program stored in the computer. Such a computer program may be stored in a computer
readable storage medium, such as, but not limited to, any type of disk including floppy
disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories
(ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical
cards, or any type of media suitable for storing electronic instructions, and each coupled
to a computer system bus.

The algorithms and displays presented herein are not inherently related to any
particular computer or other apparatus. Various general purpose systems may be used
with programs in accordance with the teachings herein, or it may prove convenient to
construct more specialized apparatus to perform the required method steps. The required
structure for a variety of these systems will appear from the description below. In
addition, the present invention is not described with reference to any particular
programming language. It will be appreciated that a variety of programming languages
may be used to implement the teachings of the invention as described herein.

Methods and apparatuses for searching both public documents and private documents in
response to a single search request are disclosed. Public documents are electronic
documents that are made available to large groups of people by the publisher of the
document. An example of a public document is a World Wide Web page. Private
documents are documents that have restricted access. An example of a private document
is a document that is generated by members of an organization and is available only to
members of the organization. As described in greater detail below, a portal appliance or
other device can be used to search both public and authorized private electronic
documents in response to a single search request, thereby improving the results of the
search and/or reducing the number of searches required to find the desired material.

System Overview 

Figure 1 illustrates one embodiment of a file management system. Client 110
represents a general purpose digital computer coupled to network 100. Network 100 may
represent a local area network (LAN), an intranet, the Internet, or any other
interconnected data path across which multiple devices may communicate. Also
connected to network 100 are facsimile machine 120, copier 125, printer 130, scanner 135,
data storage device 140, server 145, file management appliance ("FMA") 150, and portal
appliance ("PA") 160.

Facsimile machine 120 is connected to network 100 and represents a device capable
of transmitting and receiving data such as text and images over a telephone or other
communications line ("faxing"). In one embodiment, facsimile machine 120 may transmit
text and images originating in printed form, or in another embodiment, facsimile machine
120 may transmit electronic data originating from any number of devices connected to
network 100. Similarly, in one embodiment, facsimile machine 120 may print a hard copy
of the received data, or in another embodiment, facsimile machine 120 may forward the
received data to any number of devices connected to network 100.

Copier 125 represents a device capable of reproducing text and images. In one 
embodiment, copier 125 is a photocopier that reproduces printed text and images, 
whereas in another embodiment copier 125 is a photocopier that reproduces data received 
from any number of devices connected to network 100. 

Printer 130 represents a device capable of converting electronic data into printed
text and images, whereas scanner 135 represents a device capable of converting printed
text and images into electronic data. In one embodiment, facsimile machine 120,
copier 125, printer 130, and scanner 135 are each separate and distinct devices
connected to network 100. In another embodiment, a multifunction device may replace
any combination of these devices. Any number of devices may be omitted from or added
to network 100 without departing from the spirit and scope of the present invention.

In one embodiment, data storage device 140 is also coupled to network 100. In
one embodiment, data storage device 140 represents a removable storage medium such as
a CD-ROM, DVD-ROM, DVD-RAM, DVD-RW, magnetic tape or other storage
medium. In an alternative embodiment, data storage device 140 represents a
non-removable storage medium such as a hard or fixed disk drive. In one embodiment, data
storage device 140 is an archiving device.

Server 145 represents a general purpose digital computer connected to network
100 and is configured to provide network services to other devices connected to network
100. In one embodiment, server 145 provides file sharing and printer services to network
100. In another embodiment, server 145 is a Web server that provides requested
hypertext markup language (HTML) pages or files over network 100 to requesting
devices. In yet another embodiment, server 145 is a server capable of providing
configuration services to network 100.

FMA 150 is a file management appliance that is connected to network 100. In
one embodiment, FMA 150 provides document capture and indexing services. In one
embodiment, FMA 150 is a device capable of providing configuration services in
addition to document capture and indexing services to network 100. In one embodiment,
FMA 150 is not directly connected to any device, but rather is communicatively coupled
to other devices through network 100. FMA 150 is capable of publishing its presence to
other devices on network 100 using HTTP or other protocols.

Automatic document capture (or "unconscious capture"), which is discussed more
fully below, is the process by which one device requests an archiving device, such as data
storage device 140, to archive a document. In one embodiment, FMA 150 is the requesting
device; however, other devices can also request archival of documents. Greater detail with
respect to the capture of documents that are copied, faxed, or printed, and of other
documents, as well as document management, is disclosed in U.S. Patent Application
Serial No. 08/754,721, entitled "AUTOMATIC AND TRANSPARENT DOCUMENT ARCHIVING,"
filed November 21, 1996, and U.S. Patent No. 5,893,908, entitled "DOCUMENT MANAGEMENT
SYSTEM," issued April 13, 1999, both of which are incorporated by reference and
assigned to the corporate assignee of the present U.S. patent application.

A document may be composed of many distinct files of varying types, each
representing at least the partial content of the document. A print job created on client 110
and intended for printer 130 could be captured, for example, as a thumbnail image, a
PostScript file, a portable document format (PDF) file, and an ASCII file containing
extracted text. Additionally, FMA 150 is able to process multiple image file formats
including the joint photographic experts group format (JPEG), graphics interchange format
(GIF), and tagged image file format (TIFF), to name just a few. In one embodiment, each
unique file type is represented by a corresponding unique file extension appended to the
file's name. For example, a portable document format file may be represented as
filename.pdf, whereas a thumbnail image may be represented as filename.thumb.

In one embodiment, FMA 150 is able to interpret compound filename extensions.
For example, a thumbnail image file that contains images in a tagged image file format may
be represented as filename.thumb.tiff. In one embodiment, FMA 150 uses the page number
of the document as the filename. In such a manner, a document may be represented by
multiple files located in the same directory, each representing a different page of the
document as reflected by the filename. For example, "01.thumb.jpg" would represent a
thumbnail image of page one in joint photographic experts group format. Similarly,
"12.thumb.tiff" would represent a thumbnail image of page twelve in tagged image file
format.
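
The naming convention above can be illustrated with a short sketch. It is not taken from
the specification: the class and function names are hypothetical, and the sketch simply
assumes that a numeric stem is a page number and that every dot-separated suffix is one
part of a compound extension.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CapturedFileName:
    """Parsed form of a captured file name (hypothetical helper type)."""
    page: Optional[int]          # page number when the stem is numeric, else None
    stem: str                    # text before the first dot
    extensions: Tuple[str, ...]  # compound extensions, in order of appearance


def parse_captured_filename(name: str) -> CapturedFileName:
    """Split a name such as '12.thumb.tiff' into its page number and extensions."""
    stem, *extensions = name.split(".")
    # Assumption: a purely numeric stem is the page number of the document.
    page = int(stem) if stem.isascii() and stem.isdigit() else None
    return CapturedFileName(page, stem, tuple(ext.lower() for ext in extensions))
```

Under these assumptions, "12.thumb.tiff" parses to page 12 with the extensions "thumb"
and "tiff", while "filename.pdf" has no page number and the single extension "pdf".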

FMA 150 may index data captured from various devices connected to network
100 including printer 130, facsimile machine 120, client 110 and scanner 135. In one
embodiment, facsimile machine 120 captures data over a telephone line and subsequently
sends at least part of the received data to FMA 150 over network 100. In another
embodiment, data sent from client 110 to facsimile machine 120 over network 100 is
transparently (unbeknownst to the device) captured and at least part of the data is routed
to FMA 150 for indexing.

In an alternative embodiment, facsimile machine 120 is located internal to client
110, thereby eliminating the need for client 110 to send data over network 100. In such an
embodiment, FMA 150 nonetheless receives at least part of the captured data. In one
embodiment, FMA 150 receives bibliographic-type data extracted from the document. In
one embodiment, data received from facsimile machine 120 is composed in TIFF format,
whereas data received from client 110 may retain its original format upon transfer.

The FMA capture process similarly applies to other devices connected to network 
100 such as scanner 135 and copier 125. In one embodiment, if optical character 
recognition ("OCR") is performed on a scanned or copied document, FMA 150 creates 
two special OCR-related files. In one embodiment, "contents.txt" and "contents.pdf" are
created and used by FMA 150 to index the full text of the document and return page 
images as a document file respectively. 

In one embodiment, FMA 150 is capable of providing the same functionality as
any one or more of the devices on network 100, thereby eliminating the need for these
additional specialized devices. In one embodiment, however, FMA 150 is implemented
as a thin server containing enough hardware and software to support document capture
and indexing over network 100.

PA 160 is also coupled to network 100. In one embodiment, PA 160 supports
searches of captured internally available, or private, documents stored, for example, on
data storage device 140 as well as externally available, or public, documents available
from network 170. In an alternative embodiment, the functionality of PA 160 is
incorporated into FMA 150 or another device (e.g., client 110, server 145) coupled to
network 100. In one embodiment, network 170 is the Internet; however, network 170 can
be any network of electronic devices.

In one embodiment, PA 160 operates with one or more Internet portals (e.g.,
yahoo.com, excite.com, go.com) to provide searching capability of external documents.
Any Internet portal or any portal to an external network (e.g., a portal to a second
network controlled by the organization that controls network 100) can also be used by PA
160 to provide searches of external documents. In one embodiment, the portal controls
the content presented to the searching party by providing "gaps" in the search report that
can be "filled" by PA 160 to present a unified search result to the searching party. In an
alternative embodiment, PA 160 controls the content presented to the searching party by
generating search requests to one or more portals as well as a search of data storage
device 140 to search private documents. PA 160 compiles the search results and presents
the results to the searching party.

Unconscious Capture

Unconscious capture is an operation in which a device (e.g., FMA 150) requests 
an archiving device (e.g., data storage device 140) to archive a document. In general, 
unconscious capture refers to FMA 150, or other device, automatically capturing 
documents processed by network 100 or devices coupled to network 100 without user 
intervention. In one embodiment, a user can optionally prevent capture of one or more 
documents or modify which documents are automatically captured. This may be 
performed by operating a selection unit or device (e.g., pressing a button). 

Unconscious capture can be performed by any network entity or device. In one 
embodiment, unconscious capture utilizes standard Internet protocols and allows the 
capture of multiple files associated with a single document. In another embodiment, 
simultaneous capture of multiple documents is supported. 

In one embodiment, a document is represented by a directory containing one 
metadata file and at least one data file. The actual name of the document directory is not 
important during unconscious capture as the name of the document is not stored as part of 
the directory system, but is instead stored within the metadata file. In one embodiment, 
the name of the document is stored in the metadata file using a document serial number. 
In one embodiment, the capture date is used for the name of the document directory. 

In one embodiment, the capture protocol is an implementation of the Internet File
Transfer Protocol (FTP). In one embodiment, documents are captured either as
multipurpose Internet mail extension (MIME) files in the default FTP directory, or as
subdirectories of the default directory. Other capture formats can also be used.

Figure 2A illustrates one embodiment of unconscious capture using the MIME
format. A capturing device creates a MIME multi-part file, including all content files and
a metadata file, 210. The capturing device then attempts to establish an anonymous FTP
session with the destination device, 215. Once an FTP session is established, the
capturing device determines a filename that is unique on the destination device, 220, and
attempts to transfer the file to the destination device, 225. If the transfer fails, the
capturing device obtains a new filename and attempts the file transfer again. The capture
is complete upon a successful file transfer, 230.
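
The MIME-based capture of Figure 2A might be sketched as follows. This is an
illustration under stated assumptions, not the disclosed implementation: the function
names, the "metadata.txt" part name, the random ".mime" filenames, and the retry limit
are all hypothetical, and the FTP transfer is abstracted as a caller-supplied `store`
callable (which could, for instance, wrap `ftplib.FTP.storbinary`).

```python
import uuid
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Callable, Dict


def build_capture_message(metadata: str, content_files: Dict[str, bytes]) -> bytes:
    """Step 210: pack the metadata file and all content files into one MIME file."""
    message = MIMEMultipart()
    meta_part = MIMEText(metadata, "plain", "utf-8")
    meta_part.add_header("Content-Disposition", "attachment", filename="metadata.txt")
    message.attach(meta_part)
    for name, payload in content_files.items():
        part = MIMEApplication(payload)  # generic binary part for each content file
        part.add_header("Content-Disposition", "attachment", filename=name)
        message.attach(part)
    return message.as_bytes()


def capture_with_retry(data: bytes,
                       store: Callable[[str, bytes], bool],
                       max_attempts: int = 5) -> str:
    """Steps 220-230: pick a filename, try the transfer, retry with a new name."""
    for _ in range(max_attempts):
        filename = f"{uuid.uuid4().hex}.mime"  # assumed-unique name (hypothetical scheme)
        if store(filename, data):              # True means the transfer succeeded
            return filename
    raise RuntimeError("capture failed after repeated attempts")
```

Bounding the number of attempts is a design choice of this sketch; the description above
only states that a new filename is obtained and the transfer is attempted again.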

Figure 2B illustrates one embodiment of unconscious capture in an FMA 
environment. The capturing device establishes an anonymous FTP session with the 
destination device, 235. Once the FTP session is established, the capturing device 
determines what it assumes to be a unique directory name on the destination device, 240. 
Once a directory name is determined, the capturing device attempts to create a directory 
with that name on the destination device, 245. If the attempt to create the directory is 
unsuccessful, whether due to a duplicate directory name or otherwise, the capturing 
device determines another directory name and attempts to create the directory again. 

If, however, the capturing device successfully creates the directory on the 
destination device, 250, the capturing device then copies the content file or files to the 
newly created directory, 255. The capturing device also creates a metadata file, 260, 
which is then sent to the FMA device, 265 to complete the process. 
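
A minimal sketch of the directory-based capture of Figure 2B follows. It assumes an
already-established session object that offers `mkd` and `storbinary` in the manner of
Python's `ftplib.FTP`; the function name, the random directory names, the
"metadata.txt" filename, and the retry limit are hypothetical choices made for
illustration only.

```python
import io
import uuid
from ftplib import error_perm
from typing import Dict


def capture_to_directory(ftp, content_files: Dict[str, bytes], metadata: bytes,
                         max_attempts: int = 5) -> str:
    """Create a fresh directory on the destination and copy the document into it."""
    for _ in range(max_attempts):
        directory = uuid.uuid4().hex  # step 240: a name assumed to be unique
        try:
            ftp.mkd(directory)        # step 245: attempt to create the directory
            break                     # step 250: creation succeeded
        except error_perm:
            continue                  # duplicate name or other failure: try another
    else:
        raise RuntimeError("could not create a capture directory")

    for name, payload in content_files.items():  # step 255: copy the content files
        ftp.storbinary(f"STOR {directory}/{name}", io.BytesIO(payload))
    # Steps 260-265: send the metadata file last to complete the capture.
    ftp.storbinary(f"STOR {directory}/metadata.txt", io.BytesIO(metadata))
    return directory
```

Sending the metadata file last lets a receiver treat its arrival as the signal that the
document is complete, which is consistent with, though not required by, the text above.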

Figure 2C illustrates one embodiment of the document storage process in an FMA
environment. In one embodiment, the document directory is represented by
"yyyy/mm/dd" where yyyy represents the year in which the document was created, mm
represents the ordinal month in which the document was created, and dd represents the
day of the month in which the document was created. Other date formats and/or storage
ordering can also be used.

During the document storage process, the FMA creates appropriate directories,
moves the document to the appropriate directory, and updates the master list. The
metadata file of the document to be stored is accessed and information from its "Capture
date" field is retrieved, 270. If the document's "Capture date" or even the metadata file
does not exist, then the current system time is obtained and used as the document's
"Capture date," 274. If, however, the document's "Capture date" does exist, the system
determines whether an appropriately named directory exists.

The system determines whether a directory exists as reflected by the appropriate
year, 276. If a directory reflecting the appropriate year does not exist, the system creates
such a directory, 278. If a directory reflecting the appropriate year does exist, the system
then checks whether a directory reflecting the appropriate month exists within that year
directory, 280. If the appropriate month directory does not exist within the year
directory, the system creates a month directory within the year directory, 282. If the
appropriate year and month directories exist, the system finally checks whether the
appropriate day directory exists within the nested year/month directory, 284.

If the day directory does not exist, the system creates the appropriate day
directory within the year/month directory, 286. If a directory reflecting the appropriate
year, month and day already exists, the system creates a new document directory name
into which the document will be stored. In one embodiment, the system generates a
four-digit random number that gets appended to the end of the existing document directory
name, 288. Once a unique document directory name is established, 286 and 288, the
document is moved to that directory, 290, and the master document list is updated to
reflect the document's new location, 292.
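
The directory selection of Figure 2C can be sketched on a local file system as shown
below. This is one possible reading, offered as an assumption rather than as the
disclosed implementation: the function name is hypothetical, and the four-digit random
number is appended directly to the day directory name (for example "15" becomes
"150427"), with the attempt repeated if that name is also taken.

```python
import random
from datetime import date, datetime
from pathlib import Path
from typing import Optional


def choose_document_directory(root: Path,
                              capture_date: Optional[date] = None,
                              rng: Optional[random.Random] = None) -> Path:
    """Create and return a unique yyyy/mm/dd-style directory for one document."""
    rng = rng or random.Random()
    # Step 274: fall back to the current system time when no capture date exists.
    when = capture_date or datetime.now().date()

    # Steps 276-282: make sure the year and month directories exist.
    month_dir = root / f"{when.year:04d}" / f"{when.month:02d}"
    month_dir.mkdir(parents=True, exist_ok=True)

    day_name = f"{when.day:02d}"
    candidate = month_dir / day_name
    while True:
        try:
            candidate.mkdir()  # step 286: create the day directory if it is new
            return candidate
        except FileExistsError:
            # Step 288: append a four-digit random number to the existing name.
            candidate = month_dir / f"{day_name}{rng.randrange(10000):04d}"
```

Moving the document into the returned directory and updating the master document list
(steps 290 and 292) are left out of the sketch.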

Overview of a Portal Appliance 

Figure 3 illustrates one embodiment of a block diagram of a portal appliance. PA 
160 includes bus 310 or other communication device to communicate information, and 
processor 320 coupled to bus 310 to process information. While PA 160 is illustrated
with a single processor, PA 160 can include multiple processors and/or co-processors. 
PA 160 further includes random access memory (RAM) or other dynamic storage device 
350 (referred to as main memory), coupled to bus 310 to store information and 
instructions to be executed by processor 320. Main memory 350 also can be used to store 
temporary variables or other intermediate information during execution of instructions by 
processor 320. 

PA 160 also includes read only memory (ROM) and/or other static storage device 
330 coupled to bus 310 to store static information and instructions for processor 320. 
Storage device 370 is coupled to bus 310 to store information and instructions. Storage 
device 370 such as a magnetic disk or optical disc and corresponding drive can be coupled 
to PA 160. 

PA 160 can also be coupled via bus 310 to I/O devices 360, such as a cathode ray
tube (CRT) or liquid crystal display (LCD), to display information to a user, and an
alphanumeric input device to communicate information and command selections to
processor 320. Another type of I/O device is a cursor control, such as a mouse, a trackball,
or cursor direction keys to communicate direction information and command selections to
processor 320 and to control cursor movement on the display. Additional and/or different
I/O devices can also be coupled to bus 310.

Network interface 340 provides an interface between PA 160 and network 170.
Similarly, network interface 345 provides an interface between PA 160 and network 100.
In one embodiment, network interface 340 and network interface 345 are network interface
cards (NICs), which are known in the art; however, any interface that can provide PA 160
with access to multiple networks can be used.

One embodiment of the invention is related to the use of PA 160 to perform
searches on both network 100 and network 170. According to one embodiment, the
searches are performed by PA 160 in response to processor 320 executing sequences of
instructions contained in main memory 350. Instructions are provided to main memory
350 from a storage device, such as a magnetic disk, a read-only memory (ROM) integrated
circuit (IC), a CD-ROM, or a DVD, or via a remote connection (e.g., over a network), etc. In
alternative embodiments, hard-wired circuitry can be used in place of or in combination
with software instructions to implement the present invention. Thus, the present
invention is not limited to any specific combination of hardware circuitry and software
instructions.

Document Searching 

Figure 4 is a flow diagram of one embodiment of a process for searching public
documents and private documents in response to a single request. The portal appliance
receives a search request, 410. The search request can be in the form of a Boolean text
search, a plain language text search, or any other appropriate format.

The portal appliance sends the search request to one or more portals, 420. In one
embodiment, the portal appliance performs any necessary request reformatting such that the
portal search requests are recognized by the portal receiving the search request. Similarly,
the portal appliance sends the search request to a networked FMA or other device to
perform searches on unconsciously captured documents, 430. The portal search(es) and the
captured document search can be performed in parallel or sequentially.

Results from the portal search are received, 440. Similarly, results from the 
captured document search are received, 450. The search results can be received in parallel 
or sequentially. The portal appliance combines the search results, 460.
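
The parallel variant of steps 420 through 460 might look like the following sketch. The
back ends are represented by hypothetical callables that take the query text and return
a list of hits; how a real portal or FMA is queried is outside the sketch, and grouping
the combined results by source is merely one possible way to combine them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical signature: a search back end maps a query string to a list of hits.
SearchFn = Callable[[str], List[str]]


def unified_search(query: str, backends: Dict[str, SearchFn]) -> Dict[str, List[str]]:
    """Send one query to every back end in parallel and gather the results by source."""
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        # Steps 420 and 430: issue the portal and captured-document searches.
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        # Steps 440-460: collect each result set and combine them into one mapping.
        return {name: future.result() for name, future in futures.items()}
```

A caller would pass, for example, one callable per portal plus one for the captured
document search, and would then format the returned mapping for display.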

In one embodiment, the portal search report indicates where the captured document
search report is to be inserted. This embodiment provides the portal with control of the
style and content of the search report. If the search results are provided in Hypertext
Markup Language (HTML) format, for example, an anchor (<A>) tag can be used to
indicate where the captured document search report is to be inserted. For example,

<A HREF="search:foo" MODE="Table" WIDTH = 300>

indicates that the tag should be replaced with a captured document search for the word
"foo" presented in a table with a width of 300 pixels. Of course, any other format can also
be used.
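
A minimal sketch of filling such a placeholder is shown below. It assumes the
placeholder always has the `HREF="search:..."` form of the example above, ignores the
MODE and WIDTH attributes for brevity, and takes the captured document search as a
hypothetical callable that returns ready-made HTML for a search term.

```python
import re
from typing import Callable

# Matches an anchor tag whose HREF uses the "search:" scheme and captures the term.
PLACEHOLDER = re.compile(r'<A\s+HREF="search:([^"]*)"[^>]*>', re.IGNORECASE)


def fill_gaps(portal_html: str, search_captured: Callable[[str], str]) -> str:
    """Replace each placeholder tag with the captured-document search report."""
    return PLACEHOLDER.sub(lambda match: search_captured(match.group(1)), portal_html)
```

A regular expression is adequate only for this fixed placeholder form; a fuller
implementation would more likely parse the HTML and honor the presentation attributes.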

In an alternative embodiment, the portal appliance controls the style and content of
the search report by issuing search requests to one or more portals and to the FMA to search
the captured documents. In one embodiment, the portal appliance generates Hypertext
Transfer Protocol (HTTP) requests to the portals, which can be, for example, Common
Gateway Interface (CGI) programs that perform the requested searches. The portal
appliance receives the multiple search results and combines the search results into an
appropriate format.

The portal appliance outputs the combined search results, 470. In one embodiment,
the search results are presented as an HTML document having links to the documents
identified by the searches. Other formats, for example, Extensible Markup Language (XML),
can also be used. If a user selects one of the links, the document is retrieved from either
a private source coupled to a network belonging to the organization or from an external
public source.
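
As an illustration of the HTML output described above, the following sketch renders
combined results as a page of links grouped by source. The function name, the
(title, URL) pair representation, and the page layout are assumptions made for the
example.

```python
from html import escape
from typing import Dict, List, Tuple


def render_results(groups: Dict[str, List[Tuple[str, str]]]) -> str:
    """Render {source: [(title, url), ...]} as an HTML document of links."""
    lines = ["<HTML><BODY>"]
    for source, hits in groups.items():
        lines.append(f"<H2>{escape(source)}</H2>")
        lines.append("<UL>")
        for title, url in hits:
            # Escape both fields so that result text cannot break the markup.
            lines.append(f'<LI><A HREF="{escape(url, quote=True)}">{escape(title)}</A></LI>')
        lines.append("</UL>")
    lines.append("</BODY></HTML>")
    return "\n".join(lines)
```

Each link can point either to a private source on the organization's network or to an
external public source, so the same page serves both kinds of result.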

Thus, a portal appliance or other device can integrate content from a portal with
content from unconsciously captured documents to provide a searching party with a unified
search result. The unified search result provided by the portal appliance allows a searching
party to search both public documents published by other parties and internal documents
accessible by the searching party in response to a single search without requiring the
publisher of the private documents to publicly release the content of the private documents.

In one embodiment, the search report output by the portal appliance includes
advertising that is based on the search performed. The search terms can be used to
determine the advertising to be displayed to the searching party. Advertising can also be
provided by the portal appliance based on the search terms or other information, for
example, an internal user profile. By providing information for selection of advertising and
displaying the advertising, the organization controlling the portal appliance can receive
advertising revenue.

Figure 5 is a flow diagram of one embodiment of a process for capturing public
content at predetermined times for unconscious capture. In one embodiment, the portal
appliance can retrieve information at predetermined times. The retrieved information can
be used, for example, to populate a database with information provided but not archived by
a portal (e.g., stock quotes, press releases, news). The portal appliance retrieves the
information and the FMA causes the information to be captured and archived. The
archived information can then be accessed and/or searched at a later time.

The portal appliance waits for a predetermined time for retrieving information, 510. 
Predetermined content is retrieved in response to a request at the predetermined time, 520. 
The content can be retrieved by the portal appliance "pulling" the content, for example, in 
the form of one or more HTTP requests initiated by the portal appliance. The content can 
also be "pushed" by an external portal to the portal appliance, for example, with an HTTP 
or FTP operation. The content is captured by the portal appliance, 530. The content is 
archived by, for example, the FMA, 540. If additional retrievals are scheduled, 550, the
process is repeated. 
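
The "pull" variant of this process might be sketched as the loop below. The schedule format and the fetch and archive callables are hypothetical; fetch stands in for an HTTP request initiated by the portal appliance and archive stands in for the FMA. The clock and sleep functions are injectable only so that the sketch can be exercised without waiting.

```python
import time


def run_scheduled_capture(schedule, fetch, archive,
                          now=time.time, sleep=time.sleep):
    """Fetch and archive content at each scheduled time.

    schedule: iterable of (timestamp, url) pairs, assumed sorted by time.
    fetch:    hypothetical callable, url -> content.
    archive:  hypothetical callable, (url, content) -> None.
    """
    for when, url in schedule:      # repeat while retrievals remain (550)
        delay = when - now()
        if delay > 0:
            sleep(delay)            # wait for the predetermined time (510)
        content = fetch(url)        # retrieve the predetermined content (520)
        archive(url, content)       # capture the content and archive it
```

A "push" arrangement would replace the fetch call with a handler that the external portal invokes, leaving the archive step unchanged.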

In the foregoing specification, the invention has been described with reference to 
specific embodiments thereof. It will, however, be evident that various modifications and 
changes can be made thereto without departing from the broader spirit and scope of the 
invention. The specification and drawings are, accordingly, to be regarded in an 
illustrative rather than a restrictive sense. 
CLAIMS

What is claimed is: 

1. A method comprising:
    generating, automatically with an electronic system, a first search request in
response to an original search request, the first search request to cause a search to be
performed on electronic documents unconsciously captured by a local network device,
the search of the electronic documents unconsciously captured to be performed according
to search parameters of the original search request; and
    generating, automatically with the electronic system, a second search request in
response to the original search request, the second search request to cause a search to be
performed on electronic documents available via a network portal of an external network
according to the search parameters of the original search request.

2. The method of claim 1 wherein the local network device comprises a file
management appliance.

3. The method of claim 2 wherein the file management appliance generates
the first search request and the second search request.

4. The method of claim 2 wherein the file management appliance performs a
search of the unconsciously captured electronic documents in response to the first search
request.

5. The method of claim 2 wherein an Internet portal performs a search of the
electronic documents available via a network portal of an external network in response to
the second search request.

6. The method of claim 1 wherein the first search request and the second
search request are generated by a portal appliance in response to the original search
request.

7. The method of claim 1 further comprising generating a search report based
on results from the first search request and the second search request.

8. The method of claim 7 wherein the search report is a Hypertext Markup
Language (HTML) document.

9. The method of claim 7 wherein the search report is an Extensible Markup
Language (XML) document.

10. The method of claim 7 wherein the search report comprises an
advertisement selected based on the first search request.

11. The method of claim 7 wherein the search report comprises an
advertisement selected based on analysis of documents indicated by search results.






12. The method of claim 1 further comprising generating a third search
request in response to the original search request, the third search request to cause a
search to be performed on electronic documents available via a second network portal of
the external network according to the search parameters of the original search request.

13. A machine readable medium having stored thereon sequences of
instructions that, when executed by one or more processors, cause one or more electronic
devices to:
    generate, automatically, a first search request in response to an original search
request, the first search request to cause a search to be performed on electronic
documents unconsciously captured by a local network device according to search
parameters of the original search request; and
    generate, automatically, a second search request in response to the original search
request, the second search request to cause a search to be performed on electronic
documents available via a network portal of an external network according to the search
parameters of the original search request.

14. The machine readable medium of claim 13 wherein the local network
device comprises a file management appliance.

15. The machine readable medium of claim 14 wherein the file management
appliance generates the first search request and the second search request.

16. The machine readable medium of claim 14 wherein the file management
appliance performs a search of the unconsciously captured electronic documents in
response to the first search request.

17. The machine readable medium of claim 14 wherein an Internet portal
performs a search of the electronic documents available via a network portal of an
external network in response to the second search request.

18. The machine readable medium of claim 13 wherein the first search request
and the second search request are generated by a portal appliance in response to the
original search request.

19. The machine readable medium of claim 13 further comprising generating
a search report based on results from the first search request and the second search
request.

20. The machine readable medium of claim 19 wherein the search report is a
Hypertext Markup Language (HTML) document.

21. The machine readable medium of claim 19 wherein the search report is an
Extensible Markup Language (XML) document.

22. The machine readable medium of claim 19 wherein the search report
comprises an advertisement selected based on the first search request.

23. The machine readable medium of claim 19 wherein the search report
comprises an advertisement selected based on analysis of documents indicated by search
results.

24. The machine readable medium of claim 13 further comprising generating
a third search request in response to the original search request, the third search request to
cause a search to be performed on electronic documents available via a second network
portal of an external network according to the search parameters of the original search
request.

25. An apparatus comprising:
    a device to automatically capture electronic documents from the network; and
    an application to search the captured electronic documents in response to a search
request, wherein the application also generates an external document search request in
response to the search request, the external document search request to generate a search
of electronic documents from an external network.

26. The apparatus of claim 25 wherein the application is executed by the
device.

27. The apparatus of claim 25 wherein the application is executed by a second
device coupled to the device.

29. The apparatus of claim 25 wherein the search of captured electronic
documents is performed by the second device.

30. The apparatus of claim 25 wherein the external document search is
performed by an Internet portal.

31. The apparatus of claim 25 wherein the search of captured electronic
documents is performed by the device.

32. A method comprising:
    generating a search request to retrieve predetermined information from a network
portal at one or more predetermined times;
    capturing, unconsciously, the predetermined information in response to the
predetermined information being retrieved; and
    archiving the captured predetermined information.

33. An apparatus comprising:
    means for generating a search request to retrieve predetermined information from
a network portal at one or more predetermined times;
    means for capturing, unconsciously, the predetermined information in response to
the predetermined information being retrieved; and
    means for archiving the captured predetermined information.

34. A machine readable medium having stored thereon sequences of
instructions that, when executed by one or more processors, cause one or more electronic
devices to:
    generate a search request to retrieve predetermined information from a network
portal at one or more predetermined times;
    capture, unconsciously, the predetermined information in response to the
predetermined information being retrieved; and
    archive the captured predetermined information.
ABSTRACT

Methods and apparatuses for searching both public documents and private documents in
response to a single search request are disclosed. Public documents are electronic
documents that are made available to large groups of people by the publisher of the
document. An example of a public document is a World Wide Web page. Private
documents are documents that have restricted access. An example of a private document
is a document that is generated by members of an organization and is available only to
members of the organization. A portal appliance or
other device can be used to search both public and authorized private electronic
documents in response to a single search request, thereby improving the results of the
search and/or reducing the number of searches required to find the desired material.
Attorney's Docket No.: 74451.P107 Patent
DECLARATION AND POWER OF ATTORNEY FOR PATENT APPLICATION 



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below, next to my name. 

I believe I am the original, first, and sole inventor (if only one name is listed below) or an original, 
first, and joint inventor (if plural names are listed below) of the subject matter which is claimed and 
for which a patent is sought on the invention entitled 

"METHODS AND APPARATUSES FOR SEARCHING BOTH EXTERNAL PUBLIC 

DOCUMENTS AND INTERNAL PRIVATE DOCUMENTS IN RESPONSE TO A SINGLE SEARCH 
REQUEST" 



the specification of which 

X is attached hereto. 

was filed on as 

United States Application Number 

or PCT International Application Number 

and was amended on . 

(if applicable) 

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claim(s), as amended by any amendment referred to above. I do not 
know and do not believe that the claimed invention was ever known or used in the United States of 
America before my invention thereof, or patented or described in any printed publication in any 
country before my invention thereof or more than one year prior to this application, that the same 
was not in public use or on sale in the United States of America more than one year prior to this 
application, and that the invention has not been patented or made the subject of an inventor's 
certificate issued before the date of this application in any country foreign to the United States of 
America on an application filed by me or my legal representatives or assigns more than twelve 
months (for a utility patent application) or six months (for a design patent application) prior to this 
application. 

I acknowledge the duty to disclose all information known to me to be material to patentability as 
defined in Title 37, Code of Federal Regulations, Section 1.56. 

I hereby claim foreign priority benefits under Title 35, United States Code, Section 119(a)-(d), of any 
foreign application(s) for patent or inventor's certificate listed below and have also identified below 
any foreign application for patent or inventor's certificate having a filing date before that of the 
application on which priority is claimed: 
Prior Foreign Application(s)                                      Priority Claimed

(Number)        (Country)        (Day/Month/Year Filed)           Yes

(Number)        (Country)        (Day/Month/Year Filed)           Yes

(Number)        (Country)        (Day/Month/Year Filed)           Yes


I hereby claim the benefit under Title 35, United States Code, Section 119(e) of any United States
provisional application(s) listed below: 



(Application Number) Filing Date 



(Application Number) Filing Date 



I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States 
application(s) listed below and, insofar as the subject matter of each of the claims of this application 
is not disclosed in the prior United States application in the manner provided by the first paragraph 
of Title 35, United States Code, Section 112, I acknowledge the duty to disclose all information
known to me to be material to patentability as defined in Title 37, Code of Federal Regulations,
Section 1.56 which became available between the filing date of the prior application and the national
or PCT international filing date of this application: 



(Application Number) Filing Date (Status -- patented, pending, abandoned)

(Application Number) Filing Date (Status -- patented, pending, abandoned)

I hereby appoint the persons listed on Appendix A hereto (which is incorporated by reference and a 
part of this document) as my respective patent attorneys and patent agents, with full power of 
substitution and revocation, to prosecute this application and to transact all business in the Patent 
and Trademark Office connected herewith. 

Send correspondence to Michael J. Mallie, BLAKELY, SOKOLOFF, TAYLOR &
ZAFMAN LLP, 12400 Wilshire Boulevard 7th Floor, Los Angeles, California 90025 and direct
telephone calls to Michael J. Mallie, (408) 720-8598.
I hereby declare that all statements made herein of my own knowledge are true and that all
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 

Full Name of Sole/First Inventor Kurt W. Piersol 



Inventor's Signature Date 



Residence Citizenship 

(City, State) (Country) 

Post Office Address 



Full Name of Second/Joint Inventor Jamey Graham



Inventor's Signature Date 

Residence Citizenship



(City, State) (Country) 
Post Office Address 



Full Name of Third/Joint Inventor 



Inventor's Signature Date 

Residence Citizenship



(City, State) (Country) 
Post Office Address 



Full Name of Fourth/Joint Inventor 



Inventor's Signature Date 

Residence Citizenship



(City, State) (Country) 
Post Office Address 
William E. Alford, Reg. No. 37,764; Farzad E. Amini, Reg. No. P42,261; Aloysius T. C. AuYeung, Reg. No.
35,432; William Thomas Babbitt, Reg. No. 39,591; Carol F. Barry, Reg. No. 41,600; Jordan Michael 
Becker, Reg. No. 39,602; Bradley J. Bereznak, Reg. No. 33,474; Michael A. Bernadicou, Reg. No. 35,934; 
Roger W. Blakely, Jr., Reg. No. 25,831; Gregory D. Caldwell, Reg. No. 39,926; Ronald C. Card, Reg. No. 
P44,587; Thomas M. Coester, Reg. No. 39,637; Stephen M. De Klerk, under 37 C.F.R. § 10.9(b); Michael 
Anthony DeSanctis, Reg. No. 39,957; Daniel M. De Vos, Reg. No. 37,813; Robert Andrew Diehl, Reg. No. 
40,992; Matthew C. Fagan, Reg. No. 37,542; Tarek N. Fahmi, Reg. No. 41,402; James Y. Go, Reg. No. 
40,621; James A. Henry, Reg. No. 41,064; Willmore F. Holbrow III, Reg. No. P41,845; Sheryl Sue 
Holloway, Reg. No. 37,850; George W. Hoover II, Reg. No. 32,992; Eric S. Hyman, Reg. No. 30,139; Dag
H. Johansen, Reg. No. 36,172; William W. Kidd, Reg. No. 31,772; Erica W. Kuo, Reg. No. 42,775; Michael
J. Mallie, Reg. No. 36,591; Andre L. Marais, under 37 C.F.R. § 10.9(b); Paul A. Mendonsa, Reg. No.
42,879; Darren J. Milliken, Reg. No. 42,004; Lisa A. Norris, Reg. No. P44,976; Chun M. Ng, Reg. No. 36,878;
Thien T. Nguyen, Reg. No. 43,835; Thinh V. Nguyen, Reg. No. 42,034; Dennis A. Nicholls, Reg. No.
42,036; Kimberley G. Nobles, Reg. No. 38,255; Daniel E. Ovanezian, Reg. No. 41,236; Babak Redjaian,
Reg. No. 42,096; William F. Ryann, Reg. No. 44,313; James H. Salter, Reg. No. 35,668; William W. Schaal,
Reg. No. 39,018; James C. Scheller, Reg. No. 31,195; Jeffrey Sam Smith, Reg. No. 39,377; Maria
McCormack Sobrino, Reg. No. 31,639; Stanley W. Sokoloff, Reg. No. 25,128; Judith A. Szepesi, Reg. No.
39,393; Vincent P. Tassinari, Reg. No. 42,179; Edwin H. Taylor, Reg. No. 25,129; John F. Travis, Reg. 
No. 43,203; George G. C. Tseng, Reg. No. 41,355; Joseph A. Twarowski, Reg. No. 42,191; Lester J. 
Vincent, Reg. No. 31,460; Glenn E. Von Tersch, Reg. No. 41,364; John Patrick Ward, Reg. No. 40,216; 
Charles T. J. Weigell, Reg. No. 43,398; Kirk D. Williams, Reg. No. 42,229; James M. Wu, Reg. No. 
P45,241; Steven D. Yates, Reg. No. 42,242; Ben J. Yorks, Reg. No. 33,609; and Norman Zafman, Reg. 
No. 26,250; my patent attorneys, and Andrew C. Chen, Reg. No. 43,544; Justin M. Dillon, Reg. No. 
42,486; Paramita Ghosh, Reg. No. 42,806; and Sang Hui Kim, Reg. No. 40,450; my patent agents, of 
BLAKELY, SOKOLOFF, TAYLOR & ZAFMAN LLP, with offices located at 12400 Wilshire Boulevard, 7th 
Floor, Los Angeles, California 90025, telephone (310) 207-3800, and James R. Thein, Reg. No. 31,710, 
my patent attorney. 
APPENDIX B

Title 37, Code of Federal Regulations, Section 1.56
Duty to Disclose Information Material to Patentability 

(a) A patent by its very nature is affected with a public interest. The public interest is best served,
and the most effective patent examination occurs when, at the time an application is being examined, the
Office is aware of and evaluates the teachings of all information material to patentability. Each individual
associated with the filing and prosecution of a patent application has a duty of candor and good faith in
dealing with the Office, which includes a duty to disclose to the Office all information known to that individual
to be material to patentability as defined in this section. The duty to disclose information exists with respect
to each pending claim until the claim is cancelled or withdrawn from consideration, or the application becomes
abandoned. Information material to the patentability of a claim that is cancelled or withdrawn from
consideration need not be submitted if the information is not material to the patentability of any claim
remaining under consideration in the application. There is no duty to submit information which is not material
to the patentability of any existing claim. The duty to disclose all information known to be material to
patentability is deemed to be satisfied if all information known to be material to patentability of any claim
issued in a patent was cited by the Office or submitted to the Office in the manner prescribed by §§ 1.97(b)-(d)
and 1.98. However, no patent will be granted on an application in connection with which fraud on the Office
was practiced or attempted or the duty of disclosure was violated through bad faith or intentional misconduct.
The Office encourages applicants to carefully examine:

(1) Prior art cited in search reports of a foreign patent office in a counterpart application, and

(2) The closest information over which individuals associated with the filing or prosecution of a 
patent application believe any pending claim patentably defines, to make sure that any material information 
contained therein is disclosed to the Office. 

(b) Under this section, information is material to patentability when it is not cumulative to
information already of record or being made of record in the application, and

(1) It establishes, by itself or in combination with other information, a prima facie case of 
unpatentability of a claim; or 

(2) It refutes, or is inconsistent with, a position the applicant takes in: 

(i) Opposing an argument of unpatentability relied on by the Office, or 

(ii) Asserting an argument of patentability. 

A prima facie case of unpatentability is established when the information compels a conclusion that a claim is 
unpatentable under the preponderance of evidence, burden-of-proof standard, giving each term in the claim 
its broadest reasonable construction consistent with the specification, and before any consideration is given to 
evidence which may be submitted in an attempt to establish a contrary conclusion of patentability. 

(c) Individuals associated with the filing or prosecution of a patent application within the
meaning of this section are: 

(1) Each inventor named in the application;

(2) Each attorney or agent who prepares or prosecutes the application; and 

(3) Every other person who is substantively involved in the preparation or prosecution of the 
application and who is associated with the inventor, with the assignee or with anyone to whom there is an 
obligation to assign the application. 

(d) Individuals other than the attorney, agent or inventor may comply with this section by 
disclosing information to the attorney, agent, or inventor. 